Language/OS - Multiplatform Resource Library

Text File | 1993-04-24 | 64.0 KB | 1,486 lines

TREE & FOX

Explorations in computer aided natural language analysis

Manfred Jahn
English Department
University of Cologne
1993

TREES      A program to create graphs and phrase markers.

TREECAD    A utility for generating, manipulating and exploring X-bar trees, transformations and cross-language variation.

FOX        A "Frame Oriented X-bar Parser" which parses sentences in interactive or automatic mode.

TREE & FOX documents three PC-based programs aimed at processing linguistic data structures. The programs run under non-386 MS-DOS ICON (from version 8.0). Due to the experimental and provisional nature of the programs the author makes no warranties of any kind as to their robustness or suitability for any application.

Notes:
======

(1) In this file, asterisks (*) are used to mark strings that are italicized in the original printed output (available from the author).

(2) If you print this document, make sure to set a nonproportional font such as Courier or Letter Gothic.


1. TREES - a tree drawing utility

1.1. Structural representations. The basic data type of syntactic analysis is the directed graph. There are two common representations: labelled bracketings and trees. Labelled bracketing quickly tends to become obscure even with trees of moderate complexity. A tree is a far superior representation, but it takes up more space and is expensive to print. Consider the following representations:

a.  [NP [Det the] [Nbar [AP very lucky] [Nbar [N girl] ] ] ]
    [category identifiers appear as subscripts in the original printing]

b.  (NP,(Det,the),(Nbar,(AP,very lucky),(Nbar,(N,girl))))

c.
              NP
        ┌─────┴──────┐
       Det          Nbar
        │        ┌───┴─────┐
        │        AP       Nbar
        │        │         │
        │        │         N
        │        │         │
       the   very lucky   girl

(a) is a type of labelled bracketing frequently found in linguistic textbooks. (b) is a straightforward mapping of (a) into a plain string format. Directly or indirectly, it serves as the basic input data structure for all of the utility programs introduced in this report.
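The plain string format of (b) is regular enough to be read mechanically. The following Python sketch is not part of the TREE & FOX package (whose programs run under Icon); the function names parse_plan and terminals and the (label, daughters) node layout are assumptions made for illustration only. It treats a label that directly follows an opening bracket as a nonterminal and a bare item after a comma as a terminal.

```python
def parse_plan(text):
    """Parse a plan like '(NP,(Det,the))' into nested (label, daughters) pairs."""
    pos = 0

    def read_label():
        # Collect characters up to the next bracket or comma.
        nonlocal pos
        start = pos
        while pos < len(text) and text[pos] not in "(),":
            pos += 1
        return text[start:pos].strip()

    def read_node():
        nonlocal pos
        if pos >= len(text) or text[pos] != "(":
            raise ValueError(f"expected '(' at position {pos}")
        pos += 1
        label = read_label()              # label after '(' is a nonterminal
        daughters = []
        while pos < len(text) and text[pos] == ",":
            pos += 1
            if pos < len(text) and text[pos] == "(":
                daughters.append(read_node())
            else:
                daughters.append((read_label(), []))   # bare item: terminal
        if pos >= len(text) or text[pos] != ")":
            raise ValueError(f"expected ')' at position {pos}")
        pos += 1
        return (label, daughters)

    tree = read_node()
    if text[pos:].strip():
        raise ValueError("unexpected text after the closing bracket")
    return tree


def terminals(node):
    """Return the terminal items of a parsed tree from left to right."""
    label, daughters = node
    if not daughters:
        return [label]
    return [item for d in daughters for item in terminals(d)]


plan = "(NP,(Det,the),(Nbar,(AP,very lucky),(Nbar,(N,girl))))"
tree = parse_plan(plan)
print(tree[0])            # root category: NP
print(terminals(tree))    # ['the', 'very lucky', 'girl']
```

Reading the terminals from left to right gives the words that appear on the baseline of tree (c).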
Note that in the representation of (b) each nonterminal category is the label directly following an opening bracket (i.e., NP, Det, Nbar etc.), and any item that follows a comma without an opening bracket is a terminal item. (c) has been generated from (b), and it is clearly the most easily comprehensible representation of the structural relationships involved.

1.2. Input/Output. Input for program Trees comes from a plain text file called trees.in. This file can contain any number of "tree plans" in the formats specified below. The trees generated from these plans are displayed on the screen and saved to an output file called trees.out. Two input formats may be used:

a.  Single lines of labelled bracketing (as in 1.1b).

b.  A sequence of lines with indentations representing the tree structure:

        NP
          Det
            the
          Nbar
            AP
              very lucky
            Nbar
              N
                girl

    The root category must appear in column 1. Subordinate levels are indicated by progressive indentations of two spaces. Phrases consisting of several words (e.g., "very lucky") are acceptable node labels. Do not use round brackets or commas. Most higher ASCII characters (particularly 250 and up) should also be avoided.

c.  Successive tree plans must be separated by one or more blank lines. Lines beginning with a hash character (#) are treated as comments. Use a file lister to view some sample plans in trees.in.

1.3. Invoke the program with the command line iconx trees. The following parameters are then requested by the program:

a.  Terminal nodes on baseline or *in situ*.

b.  The depth of the tree (default is 8). Since Trees is a small program and memory is the only limitation, trees can be built to considerable depths.

c.  Optional: Tab offset and increments - see below.

1.4. Postediting for proportional fonts. The output trees will display correctly on the computer's text screen or if printed with a monospaced (nonproportional) typeface.
To a certain extent, Trees can provide some support towards proportionally spaced output such as the following:

    [sorry, unable to display this here; refer to original printed text]

Unfortunately, this type of output requires a certain amount of postediting. To begin with, your printer must have a monospaced typeface and a proportional typeface of roughly the same dimensions. Test this by printing or previewing a couple of sample lines with several typefaces. On HP type printers, viable combinations include Courier 16.67/Times Roman 10 pt and Courier 12/Times 12 pt. The following notes assume the Courier 12/Times 12 configuration.

The basic idea is to superimpose proportional node labels on to a monospaced scaffold of pseudographics lines. Under WordPerfect 5.1, this involves the following steps:

a.  In WordPerfect, set the monospaced font (Courier 12). Also, via the Setup option (Shift-F1,3,8), select a small unit of measurement for the position display, preferably point sizes (pt). As you can see on the status line, a left margin of 1 inch is equivalent to a horizontal offset of 72 pt. Type one space, and under Courier 12 the cursor will move in increments of 6 pt. Verify this on the status line.

b.  Run trees. At the "Calculate Tabstops" prompt press any key except ENTER. The program will now ask for an increment value, the left margin setting, an indent factor and a proportional adjustment. The defaults suggested by the program are 6, 72, 1 and 4, which happen to be right for 10 c.p.i. Courier/Times 12 pt. (Set 4.32, 72, 1 and 2.6 for Courier 16.67/Times 10.) The indent value can be used to move the tree towards the middle of the page. The proportional adjustment varies with different typefaces and has to be determined by trial and error.

c.  Trees produces the following output:

    Branches in columns: 12 18 21 25 31
    Tabstops from Margin offset: 10; by Increments: 6
    Set center Tabs at: 142 178 196 220 256

                  NP
           142   178    220
            ┌─────┴──────┐
           Det          Nbar
           142      196 220   256
            │        ┌───┴─────┐
            │        AP        N
           142   178196 220   256
            │     ┌──┴───┐     │
            │     │     Abar   │
           142   178    220   256
            │     │      │     │
            │     │      A     │
           142   178    220   256
            │     │      │     │
            a    very  lucky  girl

    The figures flush to the branches provide a visual cue (needed for step f, below) as to which edges are associated with which tabstops.

d.  Back in WordPerfect, set the monospaced font (Courier 12) and the standard fixed line height appropriate for this font. Import the tree (from trees.out) together with its list of tab stop positions.

e.  Move below the imported list of tab stop positions. Load the Tabs Menu. Make sure that the tab type is "absolute". Set the tabs: first clear them (Ctrl-End), then enter the values provided by trees. By default, WordPerfect sets "left align" (L) tabs. You need center tab stops, however, so simply place the cursor on the newly created "L"s in the Tabs line and change them to "C"s. Exit the tabs menu.

f.  Scroll past the monospaced tree and set the Times 12 font. Enter the text of the nodes, tabbing to the proper branch positions indicated in the tree and leaving blank lines as needed. What you want is an image of the tree consisting of the labels of the nodes only, in their proper positions, but without any of the pseudographics. Additional features such as bold, underlining, superscripting, italicizing etc. can all be set, now or later, providing considerable flexibility.

g.  Move back to the monospaced image of the tree. Turn on "typeover" mode and, using spaces, overwrite all monospaced text, including the tab position cues, until only a bare scaffold of pseudographics lines remains. Turn off typeover mode when finished.

h.  Make a note of the line position of the first line of the monospaced tree.
    Superimpose the Times Roman section by calling WordPerfect's "Advance to Line" (Shift-F8,4,1,3) function. Switch over to previewing mode to check alignment, and correct any mistakes.

i.  Once you have mastered the basic technique, it is worth considering putting all trees into "text boxes" which in WordPerfect define their own set of tab stops and, more importantly, have an independent line positioning feature which is not affected by any editing changes in the main text. Since tab settings in text boxes are calculated relative to a user-specified horizontal position, trees should be instructed to calculate tab stops from a left margin of zero.


2. TREECAD - Designing structural trees

2.1. Basic requirements. TreeCad runs on 386/486 based PCs with a VGA or EGA screen (no Hercules mode), a mouse and a hard disk. Operation without a mouse or with lesser processors is possible, but rather tedious. The program needs as much ordinary memory as can be made available. TreeCad can either be run from the DOS prompt or under Windows 3.1 in 386 mode.

2.2. Uses. TreeCad is a text-and-pseudographics based utility for generating, displaying and manipulating tree structures of all kinds. It is therefore especially suited to:

o   constructing and editing arbitrary trees. Special support is provided for Xbar structures.

o   hilighting structural relationships such as c-command, m-command and government.

o   demonstrating and exploring adjunction and movement patterns.

2.3. Operation under DOS.

a.  Program invocation:

        iconx TreeCad
        iconx TreeCad nomode                      [see "switches", below]
        iconx TreeCad 7,15,23,59,69,120 nomode

    TreeCad.icx can only be run from an ordinary (non-386) DOS version of ICON.

b.  Two optional switches may be set:

    - nomode [prohibits switching into 80,43 mode]

      The program attempts to switch into 80,43 mode as soon as the system.max variable (i.e., the tree depth) is set to a value larger than 11.
      Mode switching may not work for a number of reasons (e.g., insufficient memory, idiosyncratic mode commands etc.). In this case, execute a suitable mode command and invoke TreeCad with the nomode switch.

    - n1,n2,n3,n4,n5,n6 [colours]

      These are six colour attribute numbers in the range 0-255. Default settings (for an ordinary DOS screen) are 7,15,78,14,6,120. If you are displaying the screen via an LCD connected to an overhead projector you may find that some of the colours do not reproduce effectively. In this case, run the ATTRS.EXE utility to determine adequate attributes and invoke TreeCad with the new values. The attributes are used in the main menu's show group:

        n1  is the standard normal attribute;
        n2  is the general hilight attribute;
        n3  is the hilight attribute for a c-commanding constituent;
        n4  is the hilight attribute for a c-commanded item;
        n5  tones down items which are not commanded; and
        n6  hilights a governed item.

c.  Support programs. SCROLLER.EXE and the corpus file, treecad.in, must be present for the data.corpus command to work. Treecad.in is an editable text file containing a selection of trees in labelled bracketing format. If the PC is short on memory, TreeCad may not be able to run the SCROLLER. In this case, TreeCad can only be used in its scratch mode. Note that the SCROLLER is a standalone program which can handle at most 100 lines of at most 255 characters each.

d.  Other files. The corpus item selected via the SCROLLER program is fed into scroller.dsk, which is consulted when the data.corpus option is activated from TreeCad. During normal operation of TreeCad, all trees generated are saved in a protocol file called treecad.tmp. When the verbose option is ON, all diagnostic output is fed both to the screen and to treecad.tmp. Treecad.tmp is overwritten each time TreeCad is started.

2.4. Operation under Windows 3.1. TreeCad is not a proper Windows program.
However, in Windows 3.1 386 mode it can be run as a "non-windows application in a window". This has a number of advantages such as access to the black-on-white screen, a smoothly moving arrow-shaped mouse pointer, resizable system fonts, data exchange via the clipboard and inclusion of explicatory text in a separate window. Do the following steps to set up TreeCad for Windows:

a.  Copy TreeCad.ico and TreeCad.pif, i.e., the icon and program information files, to your main Windows 3.1 directory.

b.  In Windows, start the PIF-Editor. Click *File/new*. Click *browse* to locate and select TreeCad.pif. The only entries requiring any change are the lines specifying the ICON directory. Adjust this to whatever directory you are using for your TREE&FOX files. Exit the PIF-Editor, saving the changes. Invoke the Program Manager's *File/new* menu and OK the box *program item*. Enter "Treecad.pif" as *commandline* text. Click *change icon*. Disregard the error message and click OK. Use *browse* to locate treecad.ico. Select it and click OK. The TreeCad icon will appear among the Program Manager's other program symbols, and TreeCad is ready to run.

c.  Some further hints:

    - There is an option to adjust font sizes in TreeCad's system menu field. The most suitable font sizes are 8x12 and 7x12.

    - The "edit" option lets you copy all or part of your TreeCad display to the clipboard.

d.  Notice: TreeCad may crash when the system.max variable is set to a value larger than 10. This may be due to a lack of memory or the fact that no ANSI.SYS driver is presently specified in the CONFIG.SYS file. For large values of max (i.e., 11..15), make sure to resize the window and select a suitable font.

2.5. Initial menu. The initial screen presents three button groups:

          data                   system               action
    ┌──────┼───────┐  ┌──────┬─────┴─────┬──────┐  ┌────┼──────┐
    │corpus│scratch│  │max=10│verbose=OFF│tree=b│  │quit│resume│

a.  Data.

    - The corpus option runs SCROLLER.EXE.
      This program lists the contents of the corpus file treecad.in and allows trees to be imported.

    - The scratch option presents two major Xbar structures (a CP and an IP subtree) as initial experimental structures.

b.  System. This group provides three buttons to change defaults.

    - max is the number of tree levels to be displayed. If it is set to a value greater than 11 (and the nomode switch is not in force) TreeCad makes an attempt to execute "mode 80,43", giving access, in theory, to a 43 line screen. Actually, critical values for max begin at around 15, when TreeCad runs out of memory and therefore crashes. Nothing serious happens; control simply returns to DOS or Windows.

    - verbose ON instructs the program to dump all debugging writes to treecad.tmp. Under ordinary circumstances, it should remain OFF.

    - tree toggles the display of the trees to either the baseline or the *in-situ* format.

c.  Action. This is either quit or resume. The main use of resume is to reload previously edited structures after having reset one of the system variables.

2.6. The main menu consists of four groups. The show group highlights Xbar relationships. The ops group handles Xbar operations. The edit group contains a range of editing tools that allow various tree manipulations such as cutting and copying. The system group has options to undo, redo and save steps and also provides the quit button, which returns control to the initial screen. All options appear as idiosyncratic two or three letter strings:

           show          ops              edit              system
    ┌──┬──┬─┴┬──┬──┐  ┌───┼───┐  ┌───┬───┬─┴─┬───┬───┐  ┌──┬──┼──┬──┐
    │hi│cc│mc│gv│Gv│  │adj│mov│  │cpy│cut│gen│mir│ren│  │un│Re│sv│qu│

Note that the following documentation of the individual options has been arranged so as to provide a step by step tutorial as well as a reference guide.

2.7. Keyboard-based input. For users without a mouse, the following guidelines apply.
1)  In order to execute a "click on option/button XYZ", type in *two* option letters and press ENTER or the space bar.

2)  In order to "click a node", hit PgUp. Then, using the cursor keys, navigate the cursor to the beginning of a node label and press ENTER. Ctrl-Left and Ctrl-Right jump to the beginning of the next word on the left or on the right.

Whenever literal keyboard input is requested, the space bar (as well as ENTER) serves as a terminator - this is very convenient for entering input strings with one hand only. However, it also means that you cannot enter strings containing spaces.

2.8. If you want to follow the examples, start the program and set the max variable to four or five. Select data.scratch in the initial menu. TreeCad will present the following basic configuration:

        CP                  IP
     ┌──┴────┐          ┌───┴────┐
     │      Cbar        │       Ibar
     │     ┌─┴───┐      │     ┌──┴────┐
     │     │     │      │     │       VP
     │     │     │      │     │       │
     │     │     │      │     │      Vbar
     │     │     │      │     │     ┌─┴───┐
    CSp    C     IP     NP    I     V     NP

2.9. The edit group.

a.  *cut* deletes nodes and subtrees. Click on cut. The option will be hilighted and the prompt "CUT<node>" appears. Click a terminal node, and it will be pruned from the tree. After deleting an item, cut remains hilighted and active. If you want to continue lopping off branches just continue clicking other nodes you want removed. Clicking a nonterminal node such as Cbar will delete both the parent and the daughter nodes. Clicking a root node will delete a whole tree. Use system.un (see 2.10.a, below) to undo steps, if necessary. Of course, you can also always quit and restart from scratch. As an exercise, cut the scratch configuration down so that only the CP tree remains.

b.  ren (rename) affects only node text and does not alter any structural relationships. Click on ren, then click on CP and rename it to A. Continue traversing the tree changing CSp to B, C to D and IP to E, eventually obtaining the following tree (remember that you can undo steps if you make a mistake):

    1.     CP               2.     A
        ┌──┴────┐               ┌──┴────┐
        │      Cbar             │       C
        │     ┌─┴───┐           │     ┌─┴───┐
        │     │     │           │     │     │
       CSp    C     IP          B     D     E

    RENAME<node> CP TO: A
    RENAME<node> CSp TO: B (etc.)

c.  mir (mirror) exchanges peripheral daughter nodes. Click mir, then A (B and C will change places):

    1.     A                2.        A
        ┌──┴────┐                 ┌───┴─────┐
        │       C                 C         │
        │     ┌─┴───┐           ┌─┴───┐     │
        │     │     │           │     │     │
        B     D     E           D     E     B

    MIRROR<nonterminal>: A

    Click C (D and E will change places). Click C again (D and E will revert to their original positions). Click A again to reconstitute the original tree. Mir has no effect if activated on a terminal node or a parent with only one daughter.

d.  cpy (copy) copies trees, subtrees or terminals to a destination node, replacing the destination. Click cpy and node C. Then click B as the destination.

    1.     A                2.         A
        ┌──┴────┐                 ┌────┴──────┐
        │       C                 C           C
        │     ┌─┴───┐           ┌─┴───┐     ┌─┴───┐
        │     │     │           │     │     │     │
        B     D     E           D     E     D     E

    COPY<node>: C TO: B

    Another major function of cpy is to create independent trees or subtrees on the left or right periphery of the tree display space. If the destination click occurs on an empty space in column 1, an independent tree is created on the left. If the destination click falls on an empty space in column 2-79, the copy is created on the right.

    1.     A                2.    C            A
        ┌──┴────┐               ┌─┴───┐     ┌──┴────┐
        │       C               │     │     │       C
        │     ┌─┴───┐           │     │     │     ┌─┴───┐
        │     │     │           │     │     │     │     │
        B     D     E           D     E     B     D     E

    COPY<node> C TO: [click at column 1, row 1]

e.  gen (generate) is a tool for creating a variety of tree structures. You begin by clicking a destination position, which may be any node of a given tree or an empty space in column 1 or an empty space in column 2-79 (this is the same convention as for cpy). Then you either specify a head of an Xbar structure or a list of daughter nodes, or reconstitute an item from a previous cut.

    To obtain tree #2, below, activate gen, click node B and enter "N". An NP subtree is generated and replaces B. *Any* letter X entered at gen's "TO" prompt is understood to indicate the head of an Xbar structure.
    Gen creates standard Xbar structures, containing a maximal projection (XP), a specifier (XSp) and a complement (YP).

    1.     A                2.             A
        ┌──┴────┐                  ┌───────┴────────┐
        │       C                  NP               C
        │     ┌─┴───┐           ┌──┴────┐         ┌─┴───┐
        │     │     │           │      Nbar       │     │
        │     │     │           │     ┌─┴───┐     │     │
        B     D     E          NSp    N     YP    D     E

    GENERATE<node/pos> B TO<head/paste/list>: N

    At the TO prompt, the user can also enter a list of comma-delimited node labels prefixed by the space character. The new nodes will then become daughter nodes under (i.e., added to) the destination node.

    1.     A                2.         A
        ┌──┴────┐                 ┌────┴──────┐
        │       C                 B           C
        │     ┌─┴───┐           ┌─┴───┐     ┌─┴───┐
        │     │     │           │     │     │     │
        B     D     E           xx    yy    D     E

    GENERATE<node/pos> B TO<head/paste/list>: [space]xx,yy

    If, at the <node/pos> prompt, an empty space is clicked either in column 1 or column 2-79, an independent Xbar tree is generated to the left or right of the current tree space:

    1.     A                2.     A                 VP
        ┌──┴────┐               ┌──┴────┐         ┌──┴────┐
        │       C               │       C         │      Vbar
        │     ┌─┴───┐           │     ┌─┴───┐     │     ┌─┴───┐
        │     │     │           │     │     │     │     │     │
        B     D     E           B     D     E    VSp    V     YP

    GENERATE<node/pos>[click col. 60, row 1] TO<head/paste/list>: V

    Finally, a previously cut item may be resurrected in a different location by entering space+ENTER, which inserts the current contents of the paste buffer. In the following example, the C-subtree was first cut and then pasted into the position of node B.

    1.     A                2.    A
        ┌──┴────┐                 │
        │       C                 C
        │     ┌─┴───┐           ┌─┴───┐
        │     │     │           │     │
        B     D     E           D     E

    CUT<node>: C
    GENERATE<node/pos> B TO<head/paste/list> [space][ENTER]

2.10. The system group provides four options:

a.  un (undo) undoes the last operation. A maximum of five steps can be undone.

b.  Re (redo) redoes a step previously undone. A maximum of three steps can be saved in the redo buffer. Thus if you have just undone five steps, you will only be able to return to the antepenultimate stage.

c.  The sv (save) option allows you to append the current tree to treecad.in, making it permanently available for selection via the data.corpus option of the initial screen.

d.  qu (quit) returns control to the initial menu.
    The current tree can be reloaded by clicking action.resume.

2.11. The show group. With the exception of the general purpose option hi, this group mainly demonstrates Xbar-specific structural relations. The screens created in this group can be undone but not redone. Click on empty space if you want the hilighting removed.

a.  hi (hilight). Use this option if you want to hilight specific nodes. Clicking an already hilighted node resets the normal colour attribute. (The option is not available for keyboard-based input.)

b.  cc (c-command). Click cc and then any tree node. Different colour attributes hilight the scope of the c-command relation, the c-commanding node, the c-commanded nodes, and the nodes excepted from c-command. The implementation follows the operational definition given in Haegeman (1991:122):

        Start from node A and move upwards to the first branching node. Every node down (except those dominated by, or dominating, A) is a B c-commanded by A.

    In the following tree for the ungrammatical sentence *John will invite herself* (treecad.in.9), node *John* was clicked. As a result, all c-commanded nodes are hilighted on the screen (italicized below).

             IP
        ┌────┴─────┐
        NP        Ibar
        │     ┌────┴─────┐
        │     I          VP
        │     │          │
        │     │         Vbar
        │     │      ┌───┴────┐
        │     │      V        NP
        │     │      │        │
        │     │      │        │
       John  will  invite  herself

    Briefly, the sentence is ungrammatical because the anaphor *herself* should have a c-commanding binder. *John* is the only candidate, but *John* is excluded because it does not agree with *herself* in gender.

c.  mc (m-command). This is similar to c-command except that the scope is slightly different:

        Go from node A upwards to the first maximal projection. Every node down from there is a B, m-commanded by A (except nodes dominating A, or dominated by A). (Haegeman 1991:125)

    This configurational property is mainly needed for the definition of the concept of government (for which see below). In the following tree (treecad.in.11), node *will* has been clicked.
    The italicized items indicate nodes m-commanded by *will*.

            IP
        ┌───┴────┐
        NP      Ibar
        │     ┌──┴────┐        [m-command relations not shown here]
        │     I       VP
        │     │       │
        │     │      Vbar
        │     │     ┌─┴───┐
        │     │     V     NP
        │     │     │     │
        │     │     │     │
        He   will   do    it

d.  gv (government). Government is a crucial configurational property underlying a number of syntactic phenomena. The following definition (cp. Haegeman 1991:125) has been implemented:

    1) A governs B if A m-commands B and no barrier intervenes between A and B.
    2) Maximal projections except infinitival IP are barriers to government.
    3) Governors are lexical nodes V, N, P, A and tensed I.

    Consider *We want him to do it* (treecad.in.12), below. If you click the V of *want*, the program hilights three nodes: the embedded IP, the NP *him* and the VP *do it*.

            IP
        ┌───┴─────┐
        NP       Ibar
        │     ┌───┴─────┐
        │     I         VP
        │     │         │
        │     │        Vbar
        │     │     ┌───┴─────┐
        │     │     V        *IP*
        │     │     │     ┌───┴────┐
        │     │     │    *NP*     Ibar
        │     │     │     │     ┌──┴────┐
        │     │     │     │     I      *VP*
        │     │     │     │     │       │
        │     │     │     │     │      Vbar
        │     │     │     │     │     ┌─┴───┐
        │     │     │     │     │     V     NP
        │     │     │     │     │     │     │
        we   +t1   want  him    to    do    it

    The important thing is that *want* governs into the infinitival IP, but not into the VP *do it*. As a consequence, *want* can function as the case-assigner of *him*. An infinitival (non-tensed) I is represented by either (I,to), as in the example above, or by (I,+t0).

e.  Gv (Passive government). This is basically the same as government, except that you click a potential governee in order to find its governor. As a counterpart to the ungrammatical sentence in 2.11.b, above, consider *John will like Mary's description of herself* (treecad.in.15). This sentence is grammatical because it obeys the Principle of Reflexive Binding, according to which

        A reflexive must be bound in the minimal domain containing it, its governor and a subject (Haegeman 1991:202).

    Click Gv, then the NP *herself* to determine its governor: it is the preposition *of*.
             IP
        ┌────┴─────┐
        NP        Ibar
        │     ┌────┴──────┐
        │     I           VP
        │     │           │
        │     │          Vbar
        │     │     ┌─────┴───────┐
        │     │     V             NP
        │     │     │      ┌──────┴────────┐
        │     │     │     NSp             Nbar
        │     │     │      │          ┌────┴──────┐
        │     │     │      NP         N           PP
        │     │     │      │          │           │
        │     │     │      │          │          Pbar
        │     │     │      │          │        ┌──┴────┐
        │     │     │      │          │       *P*      NP
        │     │     │      │          │        │       │
       John  will  like  Mary's  description   of   herself

    The subtree that includes the reflexive, the governor and the NP *Mary's* constitutes the minimal domain for reflexive binding here (cp. Haegeman 1991:201).

2.12. The ops group covers two types of Xbar specific operations: adjunction and movement.

a.  adj (adjunction) joins two solitary trees either on a bar level or on the level of a maximal projection. For the following example of a "bar adjunction" click adj, then PP and then Xbar.

    1.     XP               PP        2.          XP
        ┌──┴────┐           │               ┌─────┴──────┐
        │      Xbar        Pbar             │           Xbar
        │     ┌─┴───┐     ┌─┴───┐           │       ┌────┴──────┐
        │     │     │     │     │           │      Xbar         PP
        │     │     │     │     │           │     ┌─┴───┐       │
        │     │     │     │     │           │     │     │      Pbar
        │     │     │     │     │           │     │     │     ┌─┴───┐
        │     │     │     │     │           │     │     │     │     │
       XSp    X     YP    P     YP         XSp    X     YP    P     YP

    ADJOIN<subtree>: PP TO<node>: Xbar

    If the subtree originally occurs on the left of the matrix tree it will be adjoined as a left branch. As an exercise, you may want to try out possible bar adjunctions for the ambiguous sentence *we saw the boy with the telescope* (treecad.in.18):

            IP                              PP
        ┌───┴─────┐                         │
        NP       Ibar                      Pbar
        │     ┌───┴────┐                ┌───┴─────┐
        │     I        VP               P         NP
        │     │        │                │     ┌───┴────┐
        │     │       Vbar              │    NSp      Nbar
        │     │     ┌──┴────┐           │     │        │
        │     │     V       NP          │     │        N
        │     │     │     ┌─┴───┐       │     │        │
        │     │     │    NSp   Nbar     │     │        │
        │     │     │     │     │       │     │        │
        │     │     │     │     N       │     │        │
        │     │     │     │     │       │     │        │
        we   +t2   saw   the   boy     with  the   telescope

    Similarly, for an adjunction to a maximal projection, click adj, then ZP and then XP:

    1.     XP             ZP    2.              XP
        ┌──┴────┐         │              ┌──────┴───────┐
        │      Xbar      Zbar            XP             ZP
        │     ┌─┴───┐     │           ┌──┴────┐         │
        │     │     │     │           │      Xbar      Zbar
        │     │     │     │           │     ┌─┴───┐     │
        │     │     │     │           │     │     │     │
        │     │     │     │           │     │     │     │
        │     │     │     │           │     │     │     │
       XSp    X     YP    Z          XSp    X     YP    Z

    ADJOIN<subtree>: ZP TO<node>: XP

    For a concrete example, see 2.14 below. As with bar-adjunction, if the subtree originates on the left of the matrix tree it will be adjoined as a left branching adjunction.
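Both kinds of adjunction shown above follow one pattern: the target node is replaced by a new node carrying the same label, whose daughters are the old target and the adjoined subtree. The Python sketch below is an illustration only and is not taken from TreeCad; the function name adjoin, the (label, daughters) node layout and the left flag are assumptions made for the example.

```python
def adjoin(tree, target, adjunct, left=False):
    """Adjoin `adjunct` to the first node labelled `target` (preorder search)."""
    done = False

    def walk(node):
        nonlocal done
        label, daughters = node
        if not done and label == target:
            done = True
            # A new node with the same label dominates the old node and the adjunct.
            pair = [adjunct, node] if left else [node, adjunct]
            return (label, pair)
        return (label, [walk(d) for d in daughters])

    return walk(tree)


def terminals(node):
    """Return the childless nodes of a tree from left to right."""
    label, daughters = node
    if not daughters:
        return [label]
    return [t for d in daughters for t in terminals(d)]


xp = ("XP", [("XSp", []), ("Xbar", [("X", []), ("YP", [])])])
pp = ("PP", [("Pbar", [("P", []), ("YP", [])])])

bar_adjoined = adjoin(xp, "Xbar", pp)   # adjunction on the bar level
max_adjoined = adjoin(xp, "XP", pp)     # adjunction to the maximal projection
print(terminals(bar_adjoined))          # ['XSp', 'X', 'YP', 'P', 'YP']
```

Passing left=True models the case in which the subtree originates on the left of the matrix tree and is adjoined as a left branch.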
b.  mov (move) is both a general editing tool and an Xbar-specific function. In its editing use, it moves a solitary tree to a terminal node of a matrix tree, an operation which is equivalent to combining a replacement copy and a cut. In configuration #1, below, click mov, then the root node Z, then the terminal node D:

    1.     A                Z         2.         A
        ┌──┴────┐         ┌─┴───┐           ┌────┴──────┐
        │       C         │     │           │           C
        │     ┌─┴───┐     │     │           │       ┌───┴─────┐
        │     │     │     │     │           │       Z         │
        │     │     │     │     │           │     ┌─┴───┐     │
        │     │     │     │     │           │     │     │     │
        │     │     │     │     │           │     │     │     │
        B     D     E     X     Y           B     X     Y     E

    MOVE<node>: Z TO<terminal>: D

    However, the main function of mov is Xbar-specific: it moves a constituent of a tree from one position to a suitable landing site, leaving a trace. The first tree below (treecad.in.24) represents the structure of the echo question *He did talk about what?* Click mov, then *did,* then C and you will get the structure of another echo question, *Did he talk about what?*

    1.
            CP
        ┌───┴─────┐
        │        Cbar
        │     ┌───┴─────┐
        │     │         IP
        │     │     ┌───┴─────┐
        │     │     NP       Ibar
        │     │     │     ┌───┴─────┐
        │     │     │     I         VP
        │     │     │     │         │
        │     │     │     │        Vbar
        │     │     │     │     ┌───┴────┐
        │     │     │     │     V        PP
        │     │     │     │     │        │
        │     │     │     │     │       Pbar
        │     │     │     │     │      ┌─┴───┐
        │     │     │     │     │      P     NP
        │     │     │     │     │      │     │
       CSp    C     he   did   talk  about  what

    2.
             CP
        ┌────┴─────┐
        │         Cbar
        │      ┌───┴─────┐
        │      │         IP
        │      │     ┌───┴─────┐
        │      │     NP       Ibar
        │      │     │     ┌───┴─────┐
        │      │     │     I         VP
        │      │     │     │         │
        │      │     │     │        Vbar
        │      │     │     │     ┌───┴────┐
        │      │     │     │     V        PP
        │      │     │     │     │        │
        │      │     │     │     │       Pbar
        │      │     │     │     │      ┌─┴───┐
        │      │     │     │     │      P     NP
        │      │     │     │     │      │     │
       CSp   did#1   he    #1   talk  about  what

    MOVE<node>: did TO<terminal>: C

    As can be seen, the trace and the moved item are coindexed by the notation #n. As an exercise, continue moving *what* to CSp, deriving *What did he talk about?* Undo this step and derive the variant *About what did he talk?* These steps illustrate the movement patterns known as "preposition stranding" and "pied piping" (Haegeman 1991:341).

2.13. Exploring German main clause patterns.
Contrary to the apparent structural resemblance between main clauses in English and German (*Mary likes John - Marie mag Jan*) it is now generally assumed that German VPs and IPs have a head-last configuration. This is indeed borne out by the fact that verb definitions in German are spontaneously presented by native speakers as (object-)object-verb paradigms, e.g. *ein Buch kaufen,* *jemandem etwas geben* etc. The following example (*daß Jan bestimmt morgen das Buch kaufen wird,* treecad.in.30) may serve to illustrate the fact that it is the German subordinate clause structure which is the most productive base structure for deriving all kinds of main clauses:

             CP
        ┌────┴──────┐
        │          Cbar
        │     ┌─────┴───────┐
        │     C             IP
        │     │     ┌───────┴─────────┐
        │     │     NP               Ibar
        │     │     │      ┌──────────┴────────────┐
        │     │     │     AdvP                    Ibar
        │     │     │      │             ┌─────────┴──────────┐
        │     │     │      │             VP                   I
        │     │     │      │             │                    │
        │     │     │      │            Vbar                  │
        │     │     │      │       ┌─────┴───────┐            │
        │     │     │      │      AdvP          Vbar          │
        │     │     │      │       │        ┌────┴─────┐      │
        │     │     │      │       │        NP         V      │
        │     │     │      │       │      ┌─┴───┐      │      │
       CSp   daß   Jan  bestimmt morgen  das   Buch  kaufen  wird

As a first step, cut off *daß*, so that two landing sites are available, CSp for phrasal structures, and C for head-to-head movement. Next, mov *wird* to C to obtain *Wird Jan bestimmt morgen das Buch kaufen?* Next, mov *bestimmt* to CSp (*Bestimmt wird Jan morgen ...*). Undo this version. Move *morgen* to CSp: *Morgen wird Jan bestimmt ...*. Undo that and, finally, derive *Jan wird bestimmt morgen das Buch kaufen.*

2.14. Another typical feature of Germanic languages is the phenomenon referred to as scrambling. Consider the derivation of *die Torte mit dem Messer schneiden* whose D-structure Haegeman (1991:540) takes to be (treecad.in.34):

                      VP
                      │
                     Vbar
            ┌─────────┴───────────┐
            PP                   Vbar
            │                ┌────┴─────┐
           Pbar              NP         V
        ┌───┴────┐           │          │
        P        NP          │          │
        │        │           │          │
        │        │           │          │
       mit   dem Messer  die Torte  schneiden

It appears that there is no suitable landing site for moving *die Torte* to a place in front of *mit dem Messer*. However, landing sites may be created by adjunction.
To do this in TreeCad, generate a solitary NP tree on the left of the VP. Adjoin this to the VP and cut it down to the following shape:

                 VP
        ┌────────┴──────────┐
        │                   VP
        │                   │
        │                  Vbar
        │         ┌─────────┴───────────┐
        │         PP                   Vbar
        │         │                ┌────┴─────┐
        │        Pbar              NP         V
        │     ┌───┴────┐           │          │
        │     P        NP          │          │
        │     │        │           │          │
        NP   mit   dem Messer  die Torte  schneiden

*Die Torte* may now be moved to the newly created landing site:

                 VP
        ┌────────┴──────────┐
       NP#1                 VP
        │                   │
        │                  Vbar
        │           ┌───────┴─────────┐
        │           PP               Vbar
        │           │             ┌───┴────┐
        │          Pbar           │        V
        │       ┌───┴────┐        │        │
        │       P        NP       │        │
        │       │        │        │        │
    die Torte  mit   dem Messer   #1   schneiden

2.15. Associative transfer rules map parametric (language specific) structural features of one language into those of another (see Rolshoven 1991 for a discussion of the concept). As shown above, an important parametric difference between English and German is the fact that the former is an SVO language in which the heads of IP and VP phrases come before their complements, whilst the latter is an SOV language with head-last characteristics. Consider again the German D-structure for *Jan wird morgen das Buch kaufen* (treecad.in.36):

              CP
        ┌─────┴───────┐
        │            Cbar
        │     ┌───────┴────────┐
        │     │                IP
        │     │     ┌──────────┴───────────┐
        │     │     NP                    Ibar
        │     │     │            ┌─────────┴──────────┐
        │     │     │            VP                   I
        │     │     │            │                    │
        │     │     │           Vbar                  │
        │     │     │      ┌─────┴───────┐            │
        │     │     │     AdvP          Vbar          │
        │     │     │      │        ┌────┴─────┐      │
        │     │     │      │        NP         V      │
        │     │     │      │      ┌─┴───┐      │      │
        │     │     │      │     NSp   Nbar    │      │
        │     │     │      │      │     │      │      │
       CSp    C    Jan   morgen  das   Buch  kaufen  wird

As it happens, the transfer rule that maps this structural configuration into English syntax is TreeCad's mirror operation. First, mirror the innermost Vbar to exchange the positions of verb and object. Then mirror the higher Vbar to move the adverb to the end of the VP. Finally, mirror the Ibar to move the auxiliary into pre-VP position.
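The three mirror steps just described can be traced in a few lines of Python. This is a sketch under assumptions, not TreeCad's own implementation: nodes are (label, daughters) pairs, the function names are invented for the example, and the node to be mirrored is addressed by a hypothetical path of daughter indices instead of a mouse click.

```python
def mirror(node, path):
    """Exchange the first and last daughters of the node reached via `path`."""
    label, daughters = node
    if not path:
        if len(daughters) < 2:
            return node                  # no effect on terminals or single daughters
        return (label, [daughters[-1]] + daughters[1:-1] + [daughters[0]])
    i = path[0]
    changed = mirror(daughters[i], path[1:])
    return (label, daughters[:i] + [changed] + daughters[i + 1:])


def terminals(node):
    """Return the childless nodes of a tree from left to right."""
    label, daughters = node
    if not daughters:
        return [label]
    return [t for d in daughters for t in terminals(d)]


german = ("CP", [
    ("CSp", []),
    ("Cbar", [
        ("C", []),
        ("IP", [
            ("NP", [("Jan", [])]),
            ("Ibar", [
                ("VP", [
                    ("Vbar", [
                        ("AdvP", [("morgen", [])]),
                        ("Vbar", [
                            ("NP", [("NSp", [("das", [])]),
                                    ("Nbar", [("Buch", [])])]),
                            ("V", [("kaufen", [])]),
                        ]),
                    ]),
                ]),
                ("I", [("wird", [])]),
            ]),
        ]),
    ]),
])

step1 = mirror(german, [1, 1, 1, 0, 0, 1])   # innermost Vbar: verb before object
step2 = mirror(step1, [1, 1, 1, 0, 0])       # higher Vbar: adverb to the end of VP
english = mirror(step2, [1, 1, 1])           # Ibar: auxiliary before VP
print(terminals(english))
# ['CSp', 'C', 'Jan', 'wird', 'kaufen', 'das', 'Buch', 'morgen']
```

The resulting word order corresponds to the English pattern subject, auxiliary, verb, object, adverb.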
        CP
   ┌────┴─────┐
   │         Cbar
   │     ┌────┴──────┐
   │     │           IP
   │     │     ┌─────┴───────┐
   │     │     NP           Ibar
   │     │     │     ┌───────┴─────────┐
   │     │     │     I                 VP
   │     │     │     │                 │
   │     │     │     │                Vbar
   │     │     │     │          ┌──────┴────────┐
   │     │     │     │         Vbar            AdvP
   │     │     │     │      ┌───┴────┐          │
   │     │     │     │      V        NP         │
   │     │     │     │      │      ┌─┴───┐      │
   │     │     │     │      │     NSp   Nbar    │
   │     │     │     │      │      │     │      │
  CSp    C    Jan   wird  kaufen  das   Buch  morgen
              John  will   buy    the   book tomorrow

As a result we get a structural scaffold for *John will buy the book tomorrow.*

3. FOX - A Frame-Oriented X-bar Parser

3.1. Overview and uses. FOX processes simple English sentences and attempts to represent their syntactic structure in the form of X-bar phrase markers (Haegeman 1991). FOX is designed to run on DOS PCs with a VGA/EGA screen and a hard disk, preferably on 80386 or higher platforms. Operation with lesser processors is possible, but tends to be sluggish.

Technically, the parser is a left-to-right, bottom-up, multipass nondeterministic parser. In the event of unresolvable lexical or structural ambiguity it attempts to produce all possible outcomes by backtracking.

In its present form the parser can be used for the following purposes:

- as an interactive demonstration package illustrating the automatic processing of a variety of core grammar (mostly textbook) cases;
- as an exploratory model for investigating lexical subcategorization and syntactic ambiguity and developing disambiguation strategies.

3.2. Operation: DOS or Windows 3.1. FOX is not a proper Windows program. However, under Windows 3.1 (386 mode) it can be run as a "non-windows application in a window". Operation under Windows has a number of advantages such as access to a smoothly moving mouse pointer, resizeable system fonts, and data exchange via the clipboard. Take the following steps to set up FOX for Windows 3.1:

a. Copy the files Fox.ico and Fox.pif to the Windows 3.1 directory.

b. In Windows, start the PIF-Editor. Click *File/new*. Click *browse* to locate and select Fox.pif.
The only entries requiring any change are the lines specifying the ICON directory. Adjust these to whatever directory you are using for your TREE&FOX files. Exit the PIF-Editor, saving the changes. Invoke the Program Manager's *File/new* menu and OK the box *program item*. Enter "Fox.pif" as *commandline* text. Click *change icon*. Disregard the error message and click OK. Use *browse* to locate Fox.ico. Select it and click OK. The FOX icon will appear among the Program Manager's other program icons, and FOX is ready to run.

c. Further hints:

- There is an option to adjust font sizes in FOX's system menu field. The most suitable font sizes are 8x12 and 7x12.
- The "edit" option lets you copy all or part of your FOX display to the clipboard.

d. FOX may crash when the sysvars.max variable is set to a value larger than 10. This may be owing to a lack of memory or to the fact that no ANSI.SYS driver is specified in the CONFIG.SYS file. For large values of max (i.e., 11..15), make sure to resize the window and select a suitable font.

3.3. Frame Orientation. The word "frame" in the parser's acronym goes back to Marvin Minsky's conceptual model of human recognition processes. His introductory definition of the key concepts is a good starting point:

    We can think of a frame as a network of nodes and relations. The "top levels" of a frame are fixed, and represent things that are always true about the supposed situation. The lower levels have many *terminals* - "slots" that must be filled by specific instances or data. Each terminal can specify conditions its assignments must meet. (The assignments themselves are usually smaller "sub-frames.") (Minsky 1975:1)

3.4. Linguistic Orientation. In keeping with Government and Binding (GB) theory conventions, the FOX parser attempts to assign X-bar S-structures which preserve their "underlying" D-structures.
For its initial syntactic frames, FOX depends on lexical "subcategorization frames" (particularly those of verbs), and it capitalizes on the "projection principle" which posits that all low-level syntactic structure is based on lexical subcategorization (for details, see Haegeman 1991). Whilst elements of "theta theory" have been encapsulated in the subcategorization frames of lexical entries, FOX is currently not aware of any of the other supplementary modular subtheories (e.g., case and binding) normally treated within GB theory.

3.5. Limitations. The parser's recognition capabilities are restricted to a purely syntactic level. To the parser, all sentences are like "The mome raths outgrabe" (from Lewis Carroll's "Jabberwocky"). Even for input such as this, human recognizers have an immeasurable advantage over the FOX parser because they assume intuitively that *mome* is an adjective, that *raths* is the plural form of a noun, and that *outgrabe* is a past tense of a verb *outgribe*. FOX must be given this information before it is able to perform a successful parse (the sentence is listed as fox.in.29).

At present, the FOX parser's grounding in realistic language data is still extremely tenuous. Among the many features the parser does not know how to handle are compounds, conjunctions, negation, gerunds, phrasal verbs, tags and many other constructions. If you have inadvertently entered a sentence containing such "unknown" elements, a bogus subcategorization category (such as X) may be used provisionally. (That won't crash the parser.) Alternatively, enter an empty string to cancel the processing of the sentence.

3.6. Trees.
The following tree represents the last stage in the parser's processing of fox.in.56, *Which book will John give to Mary?*

             CP
      ┌──────┴───────┐
     CSp            Cbar
      │         ┌────┴─────┐
     NP#2       C          IP
    ┌─┴───┐     │     ┌────┴─────┐
   NSp   Nbar  I#1    NP        Ibar
    │     │     │     │     ┌────┴──────┐
    wh    N     │     │     I           VP
    │     │     │     │     │           │
    │     │     │     │     │          Vbar
    │     │     │     │     │     ┌─────┼───────┐
    │     │     │     │     │     V     NP      PP
    │     │     │     │     │     │     │       │
  which  book  will  John   #1   give   #2   to_Mary

Note the following details:

a. There are some slight terminological idiosyncrasies - in particular, "bars" are spelled out and the various conventional designations C'', C', Spec, CompSpec, SpecComp, Det etc. are not used. In the parser's notation, which is primarily motivated by ease of computational handling, any head category X has the projections X, Xbar and XP, and the specifier node is an XSp.

b. Movements and traces are indicated by indices #1, #2, etc.

c. The display depth of the tree shown is 8, and its virtual depth is 10, which means that some of its lower nonterminal nodes are not represented (in this case, note the elliptical prep phrase). The display depth can be adjusted from 4 to around 15. From display depth 11 upwards, the parser switches to display mode 80,43 (this may not work for all screens). Normal screen mode (80,25) is restored after regular program termination.

d. Very observant readers will have noticed that the noun phrases *John* and *Mary* appear as plain NPs, whereas *book* is projected fully. This is a subcategorizing option in the lexicon.

3.7. Invocation. The ready-to-run version of the FOX parser is started by typing

    iconx fox

at the command prompt or by clicking the FOX icon in Windows' Program Manager. The initial menu comprises four options:

    ESC:quit  ENTER:corpus  SPACE:interactive mode  s:SYSVARS

a. Hit ENTER to view the corpus file fox.in. FOX runs the SCROLLER program to list this file; if this feature does not work, the parser can only be operated in interactive mode.
Pick fox.in.1, *John will see Mary.* FOX looks up the words in its internal lexicon and presents the following initial ("given") structure:

   NP    I        IP               VP        NP
   │     │     ┌──┴────┐           │         │
   │     │     │      Ibar        Vbar       │
   │     │     │     ┌─┴───┐     ┌─┴───┐     │
   │     │     │     │     │     V     │     │
   │     │     │     │     │     │     │     │
  John  will  ?NP    ?I   ?VP   see   ?NP   Mary

The parser will now automatically continue with a series of more or less successful attempts to unify the material in a fully saturated single structural tree. Most intermediate results are obtained by procedures that "build" or "grab" or "trace" something. Once the parse has run its course - either by bottoming out with a single tree structure or by getting stuck on a sequence of incompatible subtrees - press ENTER to return to the main menu. Repeat the process with some of the other sentences in the corpus file, if you like, or press ESCAPE to exit the parser.

Hint: Begin with the simple sentences in the corpus file in order to get an impression of the parser's operation. These sentences should all come out without any user intervention. Most of the sentences from fox.in.17 onwards contain material requiring interactive subcategorization (for which see below).

b. The SYSVARS option allows you to adapt the following variables to specific requirements and circumstances:

- Verbose. Normally OFF (0). If toggled to 1, all debugging writes are echoed on the screen and written to the protocol file.
- Max is the display depth of the tree. If it is set to 11 or higher, FOX attempts to execute "mode 80,43" to set a 43 line display. The safe upper limit for max lies around 15.
- Steps. The initial setting is *automatic step-by-step*. Two other settings can be toggled: (1) *user-prompted step-by-step*; (0) *no intermediate steps, final outcome only*.

c. SPACE:interactive mode. The words listed in the lexicon are displayed on the screen, and the user is prompted to enter a sentence. There are a few simple ground rules:

(1) Punctuation is allowed but will be ignored.
(2) The parser is case sensitive but will test whether a lower case version of the first word in the sentence is listed in the lexicon.

(3) There is no limit on the length of sentences, but the parser will trim the tree display to the 80 columns of the screen and has no horizontal scrolling facility. Output to the session protocol, however, is not truncated in this manner.

(4) For words not listed in the lexicon, subcategorization frames will be requested from the user.

3.8. Session protocol. There is no way of undoing or replaying steps, but all trees generated are saved in a plain text file named fox.tmp. Note that the session file is overwritten (i.e., its previous contents are deleted) at the beginning of each FOX session. If any of the trees are to be saved, fox.tmp must be copied or renamed before FOX is restarted. Be warned that, for long sessions, fox.tmp can become quite large.

3.9. The lexicon file. A lexicon has been provided in the file fox-lex, which can be edited with an ASCII editor. Its format is largely self-explanatory, but note the following details:

a. Since fox-lex is read when FOX is started and is kept in memory until program termination, its size must obviously remain within manageable proportions. (I am assuming there won't be much space left.)

b. Blank lines are ignored, likewise lines beginning with the hash character (#).

c. In the file dump shown below, the definition of the subcategorizing frames begins in column 14. The word itself and its definition must be separated by at least two spaces.

d. Words can be entered in any order.
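The format rules in (b) to (d) can be sketched as a small reader. This is a Python illustration only, not the Icon code FOX actually uses; the function name `read_lexicon` is hypothetical, and silently skipping a line that has no two-space separator is an assumption of this sketch rather than documented FOX behaviour.

```python
import re


def read_lexicon(lines):
    """Map each word to its definition string, following rules (b)-(d)."""
    lexicon = {}
    for raw in lines:
        line = raw.rstrip()
        # Rule (b): blank lines and lines beginning with '#' are ignored.
        if not line.strip() or line.startswith("#"):
            continue
        # Rule (c): word and definition are separated by two or more spaces.
        parts = re.split(r"\s{2,}", line.strip(), maxsplit=1)
        if len(parts) < 2:
            continue  # assumption: lines without a definition are skipped
        word, definition = parts
        # Rule (d): order does not matter, so a plain dictionary will do.
        lexicon[word] = definition
    return lexicon


sample = [
    "### Nouns",
    "",
    "book         N _?of",
    "given        +pt2 give",
]
print(read_lexicon(sample))
```

Splitting on the first run of two or more blanks keeps single blanks inside a definition intact, so `N _?of` and `+pt2 give` each arrive as one string.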
### FOX LEXICON ###

### Auxiliaries
be           V ?NP_?NP?AP
am           +t1 be
are          +t1 be
is           +t1 be
was          +t2 be
were         +t2 be
being        +pt1 be
been         +pt2 be
do           V ?NP_?NP
does         +t1 do
did          +t2 do
doing        +pt1 do
done         +pt2 do
have         V ?NP_?NP
has          +t1 have
had          +t2+pt2 have
having       +pt1 have
to           P

### Adjectives
able         A _%CP1
green        A
little       A
lucky        A _%CP1
wrong        A _%CP1

### Complementizers
whether      C
that         C

### Inflexionals
will         I
would        I

### Noun-Specifiers
a            NSp
the          NSp

### Nouns
boy          N
book         N _?of
Friday       N
he           NP
it           NP
John         NP
London       N
Mary         NP
man          N
student      N _?of
we           NP
you          NP
Xself        NP

### wh-words
what         NPwh
who          NPwh
whom         NPwh
which        NSpwh
when         PPwh

### Prepositions
about        P
by           P
in           P
of           P
on           P
without      P _%CP1

### Verbs
believe      V ?NP_?NP?IP?CP
believes     +t1 believe
believed     +pt2 believe
buy          V ?NP_?NP,%NP
give         V ?NP_%NP,%NP?PP
gave         +t2 give
given        +pt2 give
hate         V ?NP_?NP
hated        +t2+pt2 hate
invite       V ?NP_?NP
know         V ?NP_?CP?NP
leave        V ?NP_%NP
like         V ?NP_?NP
meet         V ?NP_?NP
met          +t2 meet
persuade     V ?NP_?NP,?CP2
persuaded    +t2+pt2 persuade
promise      V ?NP_%NP,?NP?CP1
promised     +t2+pt2 promise
promising    +pt1+a promise
read         V ?NP_?NP?CP
reading      +pt1 read
relax        V ?NP_
relaxed      +t2 relax
resign       V ?NP_
resigned     +t2 resign
see          V ?NP_?NP?CP
saw          +t2 see
seeing       +pt1+a see
seen         +pt2 see
seem         V ?ISp_?IP
seemed       +t2 seem
talk         V ?NP_?about
want         V ?NP_?NP?IP?CP1
wonder       V ?NP_?CP1
wondered     +t2 wonder

### Multiple Cats
big          N;A
sleep        V ?NP_;N

3.10. Notes on subcategorization.

a. Many lexical items can simply be subcategorized by specifying their grammatical category (cp. the entries for *green, that, boy,* etc.). In some cases, unnecessary X-bar structure may be suppressed by directly specifying the maximal projection (cp. *John, we*).

b. The notation N _?of in the case of *student* indicates that we want the parser to treat an *of*-PP following *student* as a complement.

c.
Verbs are major structure determiners, and the parser will use the subcategorization information of each verb to hypothesize two major structures: a clausal IP frame and a verb phrase frame, the latter to be slotted into the IP frame at a later stage. For the parser, verbs are subcategorized according to the number and type of their external and internal arguments (their theta grid). The external argument of a verb is usually a subject noun phrase in the specifier position of an IP. Internal arguments are noun phrases, clausal phrases or prep phrases in the complement position of a verb phrase (cp. the entries for *resign, invite, talk*). Variant complement categories are simply concatenated (cp. *believe*).

d. As for inflected verb forms, +t1 denotes present tense, +t2 past tense, +pt1 the present participle, +pt2 the past participle. If a tensed or participle form can be used as an adjunct (as in *a promising boy*), specify +a.

e. Optional or implicit theta roles are experimentally flagged by the notation %XP (cp. *give*).

f. For items exerting subject control (*promise*), specify an ?XP1 complement. For object control items (*persuade*), specify ?XP2.

g. Note the special subcategorization for the raising verb *seem*.

h. Lexical ambiguity is indicated by concatenating several subcategorization frames and separating them by a semicolon (cp. *big* and *sleep*). The parser's (inefficient) heuristic for dealing with multiple subcategorization is backtracking (see para 3.13).

i. Certain trivial subcategorization detail is handled automatically by the parser. For instance, it is not necessary to specify NP complements for prepositions. Words ending in *-ly* are taken to be adverbs. Words ending in *'s* are processed as genitive case NPs. The parser can also usually recognize passives and proceed accordingly. Also, the parser's lookup procedure decides whether a word such as *hated* is a tensed form or the past participle of *hate* in the context given.
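To make the notation in notes (a) to (h) concrete, here is a minimal Python sketch that decodes a definition string into its parts. It is not FOX's lookup procedure. All function and key names are hypothetical, and reading the comma as a separator between successive complement slots is an assumption, since the notes above do not spell it out.

```python
import itertools
import re

# One item: '?' or '%' marker, a category name, an optional control index.
ITEM = re.compile(r"([?%])([A-Za-z]+)([12]?)")


def parse_slot(text):
    """Decode one slot such as '?NP?IP?CP1' into its variant categories."""
    return [
        {"category": cat, "optional": mark == "%",
         "control": int(ctl) if ctl else None}
        for mark, cat, ctl in ITEM.findall(text)
    ]


def parse_reading(text):
    """Decode one reading such as 'V ?NP_?NP,%NP' or a bare 'N'."""
    category, _, frame = text.strip().partition(" ")
    external, _, internal = frame.partition("_")
    return {
        "category": category,
        "external": parse_slot(external),
        # Assumption: a comma separates successive complement slots.
        "internal": [parse_slot(s) for s in internal.split(",") if s],
    }


def parse_definition(definition):
    """Decode an inflected form (+t2 hate) or a base entry (N;A)."""
    definition = definition.strip()
    if definition.startswith("+"):
        flags, _, base = definition.partition(" ")
        return {"kind": "inflected",
                "flags": flags.lstrip("+").split("+"),
                "base": base.strip()}
    return {"kind": "base",
            "readings": [parse_reading(r) for r in definition.split(";")]}


def combinations(words, lexicon):
    """Every combination of readings, i.e. what backtracking has to try."""
    per_word = [parse_definition(lexicon[w])["readings"] for w in words]
    return list(itertools.product(*per_word))


lexicon = {"the": "NSp", "big": "N;A", "sleep": "V ?NP_;N"}
for combo in combinations(["the", "big", "sleep"], lexicon):
    print([reading["category"] for reading in combo])
```

For *the big sleep* the loop lists four category combinations, one for each pairing of the two readings of *big* with the two readings of *sleep*. Deciding which of them yields a well-formed tree is the parser's job and is not attempted here.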
3.11. Sample dialogue. The following is a typical interactive dialogue in which the parser requests an additional subcategorization frame (user input italicized):

    *John kissed Mary.*
    looking up: John kissed Mary
    NO LEX ENTRY FOR: kissed
    SUBCATEGORIZE: *+t2+pt2 kiss*
    NO LEX ENTRY FOR: kiss
    SUBCATEGORIZE: *V ?NP_?NP*

Interactively subcategorized items are added to the lexicon for the duration of the session and will not be requested again when they recur. These additions are temporary: on leaving the program the added words and their definitions are forgotten. There is no provision for interactively retracting or changing entries.

3.12. Handling of adjuncts. The parser has no sophisticated heuristics for placing adjuncts. Thus for the notorious *we saw the boy with the telescope* (fox.in.20), the parser will just leave the PP stranded. On the other hand, the parser can be given a cue as to where to attach the PP by changing the input either to *we saw the boy (Nbar,?PP) with the telescope* or to *we saw the boy (Vbar,?PP) with the telescope.* See the corpus file for a number of similar cases.

3.13. Lexical ambiguity. The parser's processing of ambiguous strings can be illustrated by letting it parse fox.in.27, *the big sleep*. As shown in the dump of fox-lex, *big* has been subcategorized as N;A (i.e., both as a noun and as an adjective), and *sleep* has been subcategorized as V ?NP_;N (i.e., both as an intransitive verb and as a noun). Four outcomes are possible, two of which, namely *big1+sleep1* and *big2+sleep2*, succeed, whilst the other two, *big1+sleep2* and *big2+sleep1*, fail. To observe the parsing strategy in detail, set the steps SYSVAR to 1.

4. References

Carroll, Lewis. Through the Looking-Glass. In The Annotated Alice, ed. Martin Gardner. Harmondsworth: Penguin, 1974 [1896].

Fanselow, Gisbert/Sascha W. Felix. 1990. Die Rektions- und Bindungstheorie. Tübingen: Francke.

Griswold, Ralph E./Madge T. Griswold. 1990.
The ICON Programming Language: Second Edition. Englewood Cliffs: Prentice Hall.

Griswold, Ralph. 1992. Version 8.5 of Icon for MS-DOS 386/486 Platforms. The U. of Arizona Icon Project, Doc. IPD184. [See note on ICON, below.]

Haegeman, Liliane. 1991. Introduction to Government and Binding Theory. Cambridge, Mass.: Blackwell.

Minsky, Marvin. 1975. "A Framework for Representing Knowledge". Frame conceptions and text understanding, ed. Dieter Metzing. Berlin: de Gruyter.

Radford, Andrew. 1988. Transformational Grammar: A First Course. Cambridge U.P.

Rolshoven, Jürgen. 1991. "GB und sprachliche Informationsverarbeitung mit LPS". Romanische Computerlinguistik: Theorien und Implementationen, ed. J. Rolshoven and D. Seelbach. Tübingen: Niemeyer.

Note on ICON: All main program modules in TREE & FOX were implemented using the sophisticated features offered by the ICON programming language. Griswold & Griswold (1990) is the primary reference text. The University of Arizona publishes a monthly Icon Newsletter as well as a bi-monthly technical report called The Icon Analyst. Icon has been ported to practically all types of platforms and operating systems. For subscription and ordering details, contact The Icon Project, Dept. of Computer Science, Gould-Simpson Building, The University of Arizona, Tucson AZ 85721, U.S.A.